---
title: "MCP Evals"
description: "Test your MCP server's performance in different environments"
icon: "vial"
---

Your users connect to your MCP server from different clients, such as Claude Desktop and Cursor, and with different LLMs. MCP evals help ensure that your MCP server works across these environments.

## E2E testing

We built a CLI that runs MCP evals and end-to-end (E2E) tests. The CLI simulates an end user's environment and tests popular user flows.

An example E2E test for a PayPal MCP server:

1. Connect the PayPal MCP server to the testing agent. To simulate Claude Desktop, configure the agent to use a Claude model with a default system prompt.
2. Give the agent a typical user query, such as "Create a refund for order ID 412".
3. Let the testing agent run the query.
4. Check the testing agent's trace and confirm that it called the `create_refund` tool and successfully created a refund.
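
Step 4 comes down to comparing the tools the agent actually called with the tools you expected. Here is a minimal sketch of that comparison, assuming the trace has already been reduced to a list of tool names. The `check_tool_calls` helper, the trace shape, and the `get_order` tool name are hypothetical and are not taken from the CLI's internals.

```python
def check_tool_calls(called_tools, expected_tools):
    """Return (passed, missing) for one test run.

    called_tools: tool names taken from the agent's trace, in call order.
    expected_tools: tool names the test expects to see at least once.
    """
    called = set(called_tools)
    # Keep the order of expected_tools so the report is easy to read.
    missing = [tool for tool in expected_tools if tool not in called]
    return (not missing, missing)


# Hypothetical trace for the PayPal refund query.
trace = ["get_order", "create_refund"]
passed, missing = check_tool_calls(trace, ["create_refund"])
print(passed, missing)
```

This only verifies that the tool was called. Confirming that the refund actually succeeded would also require inspecting the tool's result in the trace.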

## Quick Start

### Install

```bash
npm install -g @mcpjam/cli
```

### Set up tests

To set up, create a new directory. In that directory, create a test file and a server connection file.

#### Test file

- `prompt` is what a user would type in the chat to interact with your server.
- `expectedTools` lists the tools you expect to be called for that prompt.
- Customize the environment with `model` and the optional `advancedConfig`.

```json weather-tests.json
{
  "tests": [
    {
      "title": "Test weather tool",
      "prompt": "What's the weather in San Francisco?",
      "expectedTools": ["get_weather"],
      "model": { "id": "claude-3-5-sonnet-20241022", "provider": "anthropic" },
      "selectedServers": ["weather-server"],
      "advancedConfig": {
        "instructions": "You are a helpful weather assistant",
        "temperature": 0.1,
        "maxSteps": 5,
        "toolChoice": "auto"
      }
    }
  ]
}
```
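
Before running a suite, it can help to sanity-check the test file's shape. The sketch below is a hypothetical pre-flight check and not part of the CLI. It treats `prompt`, `expectedTools`, and `model` as required, which is an assumption based on the fields described above.

```python
import json

# Fields described in the bullets above; treating them as required is an assumption.
ASSUMED_REQUIRED = ("prompt", "expectedTools", "model")


def find_problems(config):
    """Return a list of human-readable problems in a parsed tests file."""
    tests = config.get("tests")
    if not isinstance(tests, list) or not tests:
        return ["'tests' must be a non-empty list"]
    problems = []
    for index, test in enumerate(tests):
        label = test.get("title", f"test #{index}")
        for field in ASSUMED_REQUIRED:
            if field not in test:
                problems.append(f"{label}: missing '{field}'")
        if not isinstance(test.get("expectedTools", []), list):
            problems.append(f"{label}: 'expectedTools' must be a list")
    return problems


sample = json.loads("""
{
  "tests": [
    {
      "title": "Test weather tool",
      "prompt": "What's the weather in San Francisco?",
      "expectedTools": ["get_weather"],
      "model": { "id": "claude-3-5-sonnet-20241022", "provider": "anthropic" }
    }
  ]
}
""")
print(find_problems(sample))
```

An empty list means the file passed these checks. The CLI may accept or reject files differently.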

#### Server connection file

This file is structured much like an `mcp.json` file. You must provide at least one entry in `providerApiKeys`.

```json local-dev.json
{
  "mcpServers": {
    "weather-server": {
      "command": "python",
      "args": ["weather_server.py"],
      "env": {
        "WEATHER_API_KEY": "${WEATHER_API_KEY}"
      }
    },
    "api-server": {
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer ${API_TOKEN}"
      }
    }
  },
  "providerApiKeys": {
    "anthropic": "${ANTHROPIC_API_KEY}",
    "openai": "${OPENAI_API_KEY}",
    "deepseek": "${DEEPSEEK_API_KEY}"
  }
}
```
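
The `${...}` placeholders keep secrets out of the file by referring to environment variables. This page does not describe how the CLI resolves them, so the following is only a sketch of how such substitution can work. The `expand_placeholders` helper is hypothetical, and leaving unknown names untouched is a design choice of this sketch, not documented CLI behavior.

```python
import re

# Matches ${NAME} where NAME looks like an environment variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env):
    """Recursively replace ${NAME} in strings using the env mapping."""
    if isinstance(value, str):
        # Unknown placeholders are left as-is so they are easy to spot.
        return PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value


config = {"headers": {"Authorization": "Bearer ${API_TOKEN}"}}
print(expand_placeholders(config, {"API_TOKEN": "abc123"}))
```

In a real script you would pass `os.environ` as `env` rather than a literal dictionary.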

### Run MCP Eval

```bash
mcpjam evals run --tests weather-tests.json --environment local-dev.json
```

#### Short flags

```bash
mcpjam evals run -t weather-tests.json -e local-dev.json
```

#### CLI Options

- `--tests, -t <file>`: Path to the tests configuration file (required)
- `--environment, -e <file>`: Path to the environment configuration file (required)
- `--help, -h`: Show help information
- `--version, -V`: Display version number
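
If you want to launch the evals from a script, for example in CI, the command can be wrapped with `subprocess`. The sketch below uses only the flags listed above. The `build_eval_command` and `run_evals` helpers are hypothetical. This page does not document the CLI's exit codes or output format, so the sketch returns the raw result instead of interpreting it.

```python
import subprocess


def build_eval_command(tests_file, environment_file):
    """Build the argv list for `mcpjam evals run` using the documented long flags."""
    return [
        "mcpjam", "evals", "run",
        "--tests", tests_file,
        "--environment", environment_file,
    ]


def run_evals(tests_file, environment_file):
    """Run the CLI and return the CompletedProcess for the caller to inspect."""
    command = build_eval_command(tests_file, environment_file)
    return subprocess.run(command, capture_output=True, text=True)


print(build_eval_command("weather-tests.json", "local-dev.json"))
```

Calling `run_evals("weather-tests.json", "local-dev.json")` requires `mcpjam` to be installed and on your `PATH`.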
